Hashing-Based-Estimators for Kernel Density in High Dimensions

Authors: Charikar Moses; Siminelakis Paris
Publication date: 30/08/2018

Given a set of points $P\subset \mathbb{R}^{d}$ and a kernel $k$, the Kernel Density Estimate at a point $x\in\mathbb{R}^{d}$ is defined as $\mathrm{KDE}_{P}(x)=\frac{1}{|P|}\sum_{y\in P} k(x,y)$. We study the problem of designing a data structure that, given a data set $P$ and a kernel function, returns *approximations to the kernel density* of a query point in *sublinear time*. We introduce a class of unbiased estimators for kernel density implemented through locality-sensitive hashing, and give general theorems bounding the variance of such estimators. These estimators give rise to efficient data structures for estimating the kernel density in high dimensions for a variety of commonly used kernels. Our work is the first to provide data structures with theoretical guarantees that improve upon simple random sampling in high dimensions.

Comment: A preliminary version of this paper appeared in FOCS 201

arXiv.org e-Print Archive

Crossref

Recovery Guarantees for Quadratic Tensors with Limited Observations

Authors: Charikar Moses; Liang Yingyu; Sharan Vatsal; Zhang Hongyang
Publication date: 31/10/2018

We consider the tensor completion problem of predicting the missing entries of a tensor. The commonly used CP model has a triple product form, but an alternate family of quadratic models, which are the sum of pairwise products instead of a triple product, has emerged from applications such as recommendation systems. Non-convex methods are the methods of choice for learning quadratic models, and this work examines their sample complexity and error guarantees. Our main result is that, with the number of samples being only linear in the dimension, all local minima of the mean squared error objective are global minima and recover the original tensor accurately. The techniques lead to simple proofs showing that convex relaxation can recover quadratic tensors given a linear number of samples. We substantiate our theoretical results with experiments on synthetic and real-world data, showing that quadratic models perform better than CP models in scenarios where only a limited number of observations is available.
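The contrast between the two model families in this abstract can be made concrete with a small Python sketch. It is one plausible reading of "triple product" versus "sum of pairwise products", not the paper's exact formulation: the factor vectors `a_i`, `b_j`, `c_k` and the function names are hypothetical.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(p * q for p, q in zip(u, v))


def cp_entry(a_i, b_j, c_k):
    """CP-style entry: sum over rank components of a triple product."""
    return sum(p * q * r for p, q, r in zip(a_i, b_j, c_k))


def quadratic_entry(a_i, b_j, c_k):
    """Quadratic-style entry (assumed form): sum of the pairwise inner products."""
    return dot(a_i, b_j) + dot(b_j, c_k) + dot(a_i, c_k)


def mse(observed, predict):
    """Mean squared error over the observed entries only.
    `observed` maps factor triples to observed values."""
    errors = [(predict(*factors) - value) ** 2 for factors, value in observed]
    return sum(errors) / len(errors)


# Hypothetical rank-2 factors for a single tensor entry (i, j, k).
a_i, b_j, c_k = [1.0, 2.0], [3.0, 4.0], [5.0, 6.0]
print(cp_entry(a_i, b_j, c_k))         # 1*3*5 + 2*4*6 = 63.0
print(quadratic_entry(a_i, b_j, c_k))  # 11 + 39 + 17 = 67.0
```

In a completion setting, `mse` would be minimized over the factor vectors using only the observed entries; the abstract's main result concerns the local minima of that kind of objective for quadratic models.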

arXiv.org e-Print Archive